Costa Mesa
Is the U.S. Ready for the Next War?
Late this spring, I was led into a car in Kyiv, blindfolded, and driven to a secret factory in western Ukraine. The facility belongs to TAF Drones, founded three years ago by Oleksandr Yakovenko, a young Ukrainian businessman who wanted to help fend off the Russian invasion. When the war started, Yakovenko was busy running a logistics company in Odesa, but his country needed all the help it could get. Ukraine was overmatched--fighting a larger, wealthier adversary with a bigger army and more sophisticated weapons. "The government said to me, 'We need you to make drones,' " Yakovenko told me.
- Asia > Russia (0.37)
- Europe > Ukraine > Kyiv Oblast > Kyiv (0.25)
- Europe > Russia (0.06)
- Government > Military (1.00)
- Government > Regional Government (0.89)
Seamless Interaction: Dyadic Audiovisual Motion Modeling and Large-Scale Dataset
Agrawal, Vasu, Akinyemi, Akinniyi, Alvero, Kathryn, Behrooz, Morteza, Buffalini, Julia, Carlucci, Fabio Maria, Chen, Joy, Chen, Junming, Chen, Zhang, Cheng, Shiyang, Chowdary, Praveen, Chuang, Joe, D'Avirro, Antony, Daly, Jon, Dong, Ning, Duppenthaler, Mark, Gao, Cynthia, Girard, Jeff, Gleize, Martin, Gomez, Sahir, Gong, Hongyu, Govindarajan, Srivathsan, Han, Brandon, He, Sen, Hernandez, Denise, Hristov, Yordan, Huang, Rongjie, Inaguma, Hirofumi, Jain, Somya, Janardhan, Raj, Jia, Qingyao, Klaiber, Christopher, Kovachev, Dejan, Kumar, Moneish, Li, Hang, Li, Yilei, Litvin, Pavel, Liu, Wei, Ma, Guangyao, Ma, Jing, Ma, Martin, Ma, Xutai, Mantovani, Lucas, Miglani, Sagar, Mohan, Sreyas, Morency, Louis-Philippe, Ng, Evonne, Ng, Kam-Woh, Nguyen, Tu Anh, Oberai, Amia, Peloquin, Benjamin, Pino, Juan, Popovic, Jovan, Poursaeed, Omid, Prada, Fabian, Rakotoarison, Alice, Ranjan, Rakesh, Richard, Alexander, Ropers, Christophe, Saleem, Safiyyah, Sharma, Vasu, Shcherbyna, Alex, Shen, Jia, Shen, Jie, Stathopoulos, Anastasis, Sun, Anna, Tomasello, Paden, Tran, Tuan, Turkatenko, Arina, Wan, Bo, Wang, Chao, Wang, Jeff, Williamson, Mary, Wood, Carleigh, Xiang, Tao, Yang, Yilin, Yao, Julien, Zhang, Chen, Zhang, Jiemin, Zhang, Xinyue, Zheng, Jason, Zhyzheria, Pavlo, Zikes, Jan, Zollhoefer, Michael
Human communication involves a complex interplay of verbal and nonverbal signals, essential for conveying meaning and achieving interpersonal goals. To develop socially intelligent AI technologies, it is crucial to build models that can both comprehend and generate dyadic behavioral dynamics. To this end, we introduce the Seamless Interaction Dataset, a large-scale collection of over 4,000 hours of face-to-face interaction footage from over 4,000 participants in diverse contexts. This dataset enables the development of AI technologies that understand dyadic embodied dynamics, unlocking breakthroughs in virtual agents, telepresence experiences, and multimodal content analysis tools. We also develop a suite of models that utilize the dataset to generate dyadic motion gestures and facial expressions aligned with human speech. These models can take as input both the speech and visual behavior of their interlocutors. We present a variant with speech from an LLM and integrations with 2D and 3D rendering methods, bringing us closer to interactive virtual agents. Additionally, we describe controllable variants of our motion models that can adapt emotional responses and expressivity levels and generate more semantically relevant gestures. Finally, we discuss methods for assessing the quality of these dyadic motion models, which demonstrate the potential for more intuitive and responsive human-AI interactions.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
- Information Technology > Security & Privacy (1.00)
- Government (0.92)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.87)
Science Out of Its Ivory Tower: Improving Accessibility with Reinforcement Learning
Wang, Haining, Clark, Jason, McKelvey, Hannah, Sterman, Leila, Gao, Zheng, Tian, Zuoyu, Kübler, Sandra, Liu, Xiaozhong
A vast amount of scholarly work is published daily, yet much of it remains inaccessible to the general public due to dense jargon and complex language. To address this challenge in science communication, we introduce a reinforcement learning framework that fine-tunes a language model to rewrite scholarly abstracts into more comprehensible versions. Guided by a carefully balanced combination of word- and sentence-level accessibility rewards, our language model effectively substitutes technical terms with more accessible alternatives, a task that models trained with supervised fine-tuning or guided by conventional readability measures struggle to accomplish. Our best model adjusts the readability level of scholarly abstracts by approximately six U.S. grade levels -- in other words, from a postgraduate to a high school level. This translates to roughly a 90% relative boost over the supervised fine-tuning baseline, all while maintaining factual accuracy and high-quality language. An in-depth analysis of our approach shows that balanced rewards lead to systematic modifications in the base model, likely contributing to smoother optimization and superior performance. We envision this work as a step toward bridging the gap between scholarly research and the general public, particularly younger readers and those without a college degree.
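As a rough illustration of what a "conventional readability measure" expressed in U.S. grade levels looks like, the sketch below computes the standard Flesch-Kincaid grade level. It is not the reward used in the paper; the syllable counter is a crude vowel-group heuristic, and both sample sentences are made up for this example.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as runs of consecutive vowels (heuristic, not a dictionary lookup)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade: 0.39 * words/sentences + 11.8 * syllables/words - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


# Hypothetical sentences: one jargon-heavy, one plain.
dense = "Scholarly publications frequently incorporate specialized terminology."
plain = "The tool helps people do their exercises at home."

print(f"dense: {flesch_kincaid_grade(dense):.1f}")
print(f"plain: {flesch_kincaid_grade(plain):.1f}")
```

Because the formula only sees sentence length and syllable counts, it gives no credit for replacing a technical term with a plainer word of similar length, which is one reason a measure like this can be a weak guide for term substitution.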
- North America > United States > Montana > Gallatin County > Bozeman (0.04)
- North America > United States > Indiana > Monroe County > Bloomington (0.04)
- Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
- Health & Medicine (1.00)
- Education > Educational Setting > K-12 Education (1.00)
- Government > Regional Government > North America Government > United States Government (0.67)
Palmer Luckey Is Bringing Anduril Smarts to Microsoft's Military Headset
The founder of Oculus VR is returning to headsets--this time for the battlefield. When Palmer Luckey was hacking together virtual reality headsets at his startup Oculus VR in the mid-2010s, he would sometimes imagine a future in which US soldiers used the technology to sharpen their battlefield senses. That vision is now virtually a reality after a deal that will bring software from his defense startup, Anduril, to a US Army head-mounted display developed by Microsoft. "The idea is to enhance soldiers," Luckey tells WIRED over Zoom from his home in Newport Beach, California. "Their visual perception, audible perception--basically to give them all the vision that Superman has, and then some, and make them more lethal."
- North America > United States > California > Orange County > Newport Beach (0.25)
- South America (0.05)
- North America > United States > Virginia (0.05)
- Government > Regional Government > North America Government > United States Government (1.00)
- Government > Military > Army (1.00)
- Information Technology > Hardware (1.00)
- Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (0.92)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.72)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)
Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models
Qin, Yulei, Yang, Yuncheng, Guo, Pengcheng, Li, Gang, Shao, Hang, Shi, Yuchen, Xu, Zihan, Gu, Yun, Li, Ke, Sun, Xing
Instruction tuning plays a critical role in aligning large language models (LLMs) with human preference. Despite the vast number of open instruction datasets, naively training an LLM on all existing instructions may be neither optimal nor practical. To pinpoint the most beneficial datapoints, data assessment and selection methods have been proposed in the fields of natural language processing (NLP) and deep learning. However, in the context of instruction tuning, there is still a gap in knowledge about which data evaluation metrics can be employed and how they can be integrated into the selection mechanism. To bridge this gap, we present a comprehensive review of the existing literature on data assessment and selection, especially for instruction tuning of LLMs. We systematically categorize all applicable methods into quality-based, diversity-based, and importance-based ones, structured under a unified, fine-grained taxonomy. For each category, representative methods are elaborated to describe the landscape of relevant research. In addition, we compare the latest methods on their officially reported results to provide in-depth discussions of their limitations. Finally, we summarize the open challenges and propose promising avenues for future studies. All related contents are available at https://github.com/yuleiqin/fantastic-data-engineering.
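To make the quality-based and diversity-based categories concrete, here is a toy sketch of a two-stage selection. It is not a method from the survey: the quality proxy (distinct tokens in the response), the Jaccard threshold `max_sim`, and the sample pool are all assumptions chosen only for illustration.

```python
def tokens(text: str) -> set:
    """Lowercased whitespace tokens as a set."""
    return set(text.lower().split())


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def quality(sample: dict) -> int:
    # Hypothetical quality proxy: number of distinct tokens in the response.
    return len(tokens(sample["response"]))


def select(samples: list, k: int, max_sim: float = 0.5) -> list:
    """Rank by quality, then greedily keep samples whose instruction is
    not too similar to any instruction already kept (diversity filter)."""
    ranked = sorted(samples, key=quality, reverse=True)
    chosen = []
    for sample in ranked:
        if len(chosen) == k:
            break
        sample_tokens = tokens(sample["instruction"])
        if all(jaccard(sample_tokens, tokens(c["instruction"])) <= max_sim for c in chosen):
            chosen.append(sample)
    return chosen


# Made-up instruction pool with one near-duplicate instruction.
pool = [
    {"instruction": "Translate hello to French", "response": "Bonjour"},
    {"instruction": "Translate hello to French please", "response": "Hello in French is bonjour"},
    {"instruction": "Sum two and three", "response": "Two plus three equals five"},
]

picked = select(pool, k=3)
for sample in picked:
    print(sample["instruction"])
```

Even with `k=3`, the near-duplicate translation prompt is dropped, so the filter can return fewer samples than requested. Importance-based selection, the third category, would need a signal from the model being tuned and is left out of this sketch.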
- Asia > Middle East > Jordan (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Overview (1.00)
- Research Report > New Finding (0.45)
A Telerehabilitation System for the Selection, Evaluation and Remote Management of Therapies
Anton, David, Berges, Idoia, Bermúdez, Jesús, Goñi, Alfredo, Illarramendi, Arantza
Telerehabilitation systems that support physical therapy sessions anywhere can help reduce healthcare costs while also improving the quality of life of users who need rehabilitation. The main contribution of this paper is to present, as a whole, all the features supported by the innovative Kinect-based Telerehabilitation System (KiReS). In addition to the functionalities provided by current systems, it handles two new ones that could be incorporated into them, taking a step forward toward a new generation of telerehabilitation systems. The knowledge extraction functionality handles knowledge about the physical therapy record of patients and treatment protocols described in an ontology, named TRHONT, to select appropriate exercises for the rehabilitation of patients. The teleimmersion functionality provides a convenient, effective and user-friendly experience when performing the telerehabilitation, through two-way real-time multimedia communication. The ontology contains about 2300 classes and 100 properties, and the system allows reliable transmission of Kinect video, depth, audio and skeleton data, adapting to various network conditions. Moreover, the system has been tested with patients who suffered from shoulder disorders or had undergone total hip replacement.
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > California > Alameda County > Berkeley (0.14)
- North America > United States > California > San Diego County > San Diego (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
- Leisure & Entertainment > Games > Computer Games (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Health Care Providers & Services (1.00)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.67)
From Ultra-Fine to Fine: Fine-tuning Ultra-Fine Entity Typing Models to Fine-grained
For the task of fine-grained entity typing (FET), due to the use of a large number of entity types, it is usually considered too costly to manually annotate a training dataset that contains an ample number of examples for each type. A common way to address this problem is to use distantly annotated training data that contains incorrect labels. However, the performance of models trained solely with such data can be limited by the errors in the automatic annotation. Recently, a few approaches have departed from this convention, but without sufficient direct entity typing supervision they may also yield inferior performance. In this paper, we propose a new approach that avoids the need to create distantly labeled data whenever there is a new type schema. We first train an entity typing model that has extremely broad type coverage by using the ultra-fine entity typing data. Then, when there is a need to produce a model for a newly designed fine-grained entity type schema, we can simply fine-tune the previously trained model with a small number of examples annotated under this schema. Experimental results show that our approach achieves outstanding performance for FET under the few-shot setting. It can also outperform state-of-the-art weak supervision based methods after fine-tuning the model with only a small manually annotated training set.
- North America > United States > Tennessee > Shelby County > Memphis (0.04)
- North America > United States > California > Orange County > Costa Mesa (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Asia > China > Jiangsu Province > Nanjing (0.04)
Detecting The Corruption Of Online Questionnaires By Artificial Intelligence
Lebrun, Benjamin, Temtsin, Sharon, Vonasch, Andrew, Bartneck, Christoph
Online questionnaires that use crowd-sourcing platforms to recruit participants have become commonplace, due to their ease of use and low costs. Artificial Intelligence (AI) based Large Language Models (LLMs) have made it easy for bad actors to automatically fill in online forms, including generating meaningful text for open-ended tasks. These technological advances threaten the data quality of studies that use online questionnaires. This study tested whether text generated by an AI for the purpose of an online study can be detected by both humans and automatic AI detection systems. While humans were able to correctly identify authorship of text above chance level (76 percent accuracy), their performance was still below what would be required to ensure satisfactory data quality. Researchers currently have to rely on the disinterest of bad actors to successfully use open-ended responses as a useful tool for ensuring data quality. Automatic AI detection systems are currently completely unusable. If AIs become too prevalent in submitting responses, then the costs associated with detecting fraudulent submissions will outweigh the benefits of online questionnaires. Individual attention checks will no longer be a sufficient tool to ensure good data quality. This problem can only be systematically addressed by crowd-sourcing platforms. They cannot rely on automatic AI detection systems, and it is unclear how they can ensure data quality for their paying clients.
- North America > United States > New York > New York County > New York City (0.04)
- Oceania > New Zealand (0.04)
- Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Questionnaire & Opinion Survey (1.00)
- Health & Medicine (1.00)
- Information Technology > Security & Privacy (0.45)
- Education > Educational Setting (0.45)
Towards More Human-like AI Communication: A Review of Emergent Communication Research
In the initial phase of AI research following the second AI winter, the focus was on identifying new areas where AI could outperform humans, with famous examples including chess [Silver et al., 2018], Go [Silver et al., 2016], and StarCraft [Vinyals et al., 2019]. While this was a limited application to games, it set the tone for research to prioritize building AI agents with superhuman capabilities. However, over the last decade, the research community has witnessed a shift towards a human-centric approach that aims to leverage AI to aid humans in everyday tasks and relieve them of repetitive duties [Xu, 2019, Riedl, 2019, Shneiderman, 2021]. The interaction between humans and machines is a crucial aspect of human-centric AI [Mikolov et al., 2016], and it should take place in domains with which humans are already familiar and which require little to no training. Therefore, applications that involve niche practices, such as coding and mathematics, should be avoided in favor of language-based applications. In particular, human-machine communication should be grounded in natural language, which presents the challenge of teaching artificial agents to communicate in multiple languages. Recent advances in natural language processing (NLP) have led to the emergence of the transformer architecture [Vaswani et al., 2017], which has become the preferred approach for language-based applications, as exemplified by Language Models (LMs) such as GPT-3 [Brown et al., 2020], LLaMA [Touvron et al., 2023], and LaMDA [Thoppilan et al., 2022]. One of the challenges for language model architectures is their focus on predicting the next word in a sentence rather than comprehending the broader context and purpose of language usage. While humans use language as a tool for coordination and communication to thrive in a shared environment, artificial intelligence may struggle to fully understand the subtleties and complexities of language.
- North America > United States > California > Los Angeles County > Long Beach (0.14)
- Europe > France (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Health & Medicine > Therapeutic Area > Neurology (0.92)
- Leisure & Entertainment > Games > Computer Games (0.88)
- Education > Curriculum > Subject-Specific Education (0.66)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.86)
Data Scientist (Remote USA ONLY) at Experian - Costa Mesa, CA, United States
Experian is the world's leading global information services company. During life's big moments – from buying a home or a car, to sending a child to college, to growing a business by connecting with new customers – we empower consumers and our clients to manage their data with confidence. We help individuals to take financial control and access financial services, businesses to make smarter decisions and thrive, lenders to lend more responsibly, and organizations to prevent identity fraud and crime. We have 20,000 people operating across 44 countries, and every day we're investing in new technologies, talented people, and innovation to help all our clients maximize every opportunity. We are thrilled to share that FORTUNE has named Experian one of the 100 Best Companies to Work For. In addition, for the last five years we've been named in the top 100 "World's Most Innovative Companies" by Forbes Magazine.
- Information Technology > Data Science (0.40)
- Information Technology > Artificial Intelligence (0.40)